Abstract: It is difficult to detect and evaluate the number of communities in complex networks,
especially when the boundary between intra- and inter-community densities is ambiguous. In this
paper, discrete nodal domain theory is used to provide a criterion for determining how many
communities a network has and how to partition them, by means of the topological structure and
geometric characterization. By capturing the signs of certain Laplacian eigenvectors we can
separate the network into several reasonable clusters. The method leads to a fast and effective
algorithm with applications to a variety of real network data sets.

1 Introduction

Surveys of many real-world networks, including the World-Wide Web [21 [22], metabolic
networks [29] [23], epidemiology [361113], and scientific collaboration and citation networks
[38, ...], have prompted plenty of models for studying their topological features and dynamic
behaviors, such as the WS model [13], the BA model [1] and the random configuration model [TO].
Recently, a particular and useful network structure, called "communities" or "clusters", has
attracted considerable attention. Although it has no precise definition yet, a community
structure is commonly described as a division of nodes into groups with dense connections inside
and sparse connections outside. In this sense, communities could play an important role as the
basic functional components of complex networks.

In practice, some scientists are concerned with how the individual behavior of nodes or edges
affects the surroundings [24] [US], while others focus on the dynamics of the ensemble of all
nodes in a network [31 E]. These two emphases suggest two possible directions in community
detection: (1) detail exploitation: identifying the nodes or edges whose absence influences the
network's dynamics most, which are mostly the boundary parts; (2) global testing: searching for
a partition structure that is either close enough to the original network or most distinct from
the corresponding random network.

It has long been accepted that centrality measures [Ml [20] EH EI] work well in characterizing
the relative importance of a node in a network [H HSl [25], and similar measures exist for edges.
In 2002, a fast algorithm that assigns a betweenness measure to each edge of the network gave
rise to an explosive growth of activity in this field. Girvan and Newman [27] used this measure
to quantify the role of each edge in information transmission along paths of minimal length
across the network. Removing edges with high betweenness can expose the community structure.

Besides these detail exploitation methods, scientists also take the network as a whole into
account. Here two extreme situations may be examined: how far the original network departs from
its corresponding un-clustered version and, correspondingly, from a well-clustered one.

Consideration of the first situation led to one of the most popular quality functions, the
so-called modularity [39] [41] [42]. The basic idea is to compare the actual fraction of edges
inside groups with the fraction expected when edges are placed at random. An improved version
[41] was also developed to measure the difference between the actual network and a null model,
which yields networks that are not supposed to have natural community structure [37] [41].

Furthermore, a probabilistic framework based on stochastic matrices provides another important
quality function [16] [17]. By introducing a metric on the space of Markov chains K that
represent random walks on the network [17], a simple stochastic structure can be detected as the
best approximation to the dynamic behavior of K. This simple structure may contain the community
information of the original network.

Although the partition problem is NP-hard [M], improved approximation techniques compute an
'optimized' partition under certain constraints [40] [41] [28] [16]. However, the question of
the number of communities is still not easy to answer [41[ll7j.

In this paper we introduce a method based on the weak-nodal-domain partition (WNDP), which
can suggest the number of clusters by exploring the information in the Laplacian eigenvectors.
In Sec. II, we begin with a brief review of the spectral partition method and the classical
nodal domain theorem. In Sec. III, the main method and algorithm are presented. In Sec. IV, the
algorithm is applied to three real network cases with fast and efficient results. In Sec. V, an
exceptional case is demonstrated for further understanding of the method.

2 Spectral Partition and Nodal Domains Theory 

An exact mathematical definition of a cluster has not been made explicit so far, although there
is common agreement on minimizing the number of edges whose removal separates the network into
groups with no inter-connection. A reasonable mathematical framework is required to capture the
essential properties of the network both precisely and efficiently. In particular, quite a
number of clustering methods involve special matrices, for example the adjacency matrix and the
graph Laplacian. All these matrices carry the topological information of the network, either
explicitly or implicitly.

It has already been shown, somewhat surprisingly, that eigenvectors of special matrices work
well in clustering |11[ I35j. Scientists are interested in how the properties of these matrices
map to notable structures of the corresponding networks. In the following sections, the
eigenvectors of the graph Laplacian $\mathcal{L}$ serve as a tool to detect and analyze the
community structure of networks.



2.1 Spectral Partition 

Let $G = (V, E)$ be an undirected graph with labeled vertex set $V = \{1, 2, \dots, n\}$. For an
unweighted graph, the adjacency matrix $A = (A_{ij})$ is defined by

$$A_{ij} = \begin{cases} 1, & \text{if } i \sim j, \\ 0, & \text{otherwise,} \end{cases}$$

where $i \sim j$ means that $i$ and $j$ are adjacent. The unnormalized graph Laplacian matrix is
defined as $\mathcal{L} = D - A$. Here, $D = \mathrm{Diag}(d_1, \dots, d_n)$ is the diagonal
degree matrix with $d_i = \sum_{j=1}^{n} A_{ij}$. Another important concept is the cut size:

$$\mathrm{Cut} = \frac{1}{2} \sum_{\substack{i, j \text{ in} \\ \text{different groups}}} A_{ij}. \qquad (1)$$

The factor $\frac{1}{2}$ compensates for the double count, since $A_{ij} = A_{ji}$. The cut size
is the number of edges connecting different communities. The traditional approach is to minimize
the cut size over all possible partitions.
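
The definitions above can be illustrated with a minimal sketch. The toy graph (two triangles joined by one edge) and the helper names `laplacian` and `cut_size` are assumptions made for this illustration only; they do not come from the original experiments.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian D - A for a symmetric 0/1 adjacency matrix."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def cut_size(A, labels):
    """Half the sum of A_ij over ordered pairs (i, j) in different groups, as in Eq. (1)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    return 0.5 * A[different].sum()

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

L = laplacian(A)
cut = cut_size(A, [0, 0, 0, 1, 1, 1])  # only the bridge 2-3 is cut
```

Separating the two triangles cuts a single edge, and every row of the Laplacian sums to zero.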

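Given a partition of $V$ into $k$ sets $A_1, A_2, \dots, A_k$, we rewrite Eq. (1) in a general form:

$$\mathrm{Cut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \mathrm{Cut}_i, \qquad (2)$$

where $\mathrm{Cut}_i = \sum_{u \in A_i,\, v \in \bar{A}_i} A_{uv}$ and $\bar{A}_i$ is the
complement of $A_i$ in $V$.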

We then introduce an $n \times k$ matrix $S = (S_{ij})$ to indicate the positions of vertices in
communities:

$$S_{ij} = \begin{cases} 1, & \text{if } i \in A_j, \\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$

Note that the columns $S_j = (S_{1j}, \dots, S_{nj})^T$ of $S$ are mutually orthogonal, and that
the matrix satisfies the normalization $\mathrm{Tr}(S^T S) = n$. Thus,

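$$\mathrm{Cut}_i = \sum_{u \in A_i,\, v \in \bar{A}_i} A_{uv} = \sum_{u \in A_i} \sum_{v \in V} A_{uv} - \sum_{u \in A_i} \sum_{v \in A_i} A_{uv}. \qquad (4)$$

Now put Eq. (4) into matrix form:

$$\mathrm{Cut}_i = \sum_{u} D_{uu} S_{ui} S_{ui} - \sum_{u, v} A_{uv} S_{ui} S_{vi} = S_i^T \mathcal{L} S_i.$$

Hence,

$$\mathrm{Cut}(A_1, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} S_i^T \mathcal{L} S_i = \frac{1}{2} \mathrm{Tr}(S^T \mathcal{L} S). \qquad (5)$$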
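
Eq. (5) can be checked numerically. The following is a minimal sketch on a hypothetical path graph with six vertices split into three groups; the helper `indicator_matrix` is an illustrative name, not part of the original method.

```python
import numpy as np

def indicator_matrix(labels, k):
    """n x k matrix S with S[i, h] = 1 if vertex i belongs to group h, else 0 (Eq. (3))."""
    S = np.zeros((len(labels), k))
    S[np.arange(len(labels)), labels] = 1.0
    return S

# Hypothetical example: the path 0-1-2-3-4-5.
n = 6
A = np.zeros((n, n))
for u in range(n - 1):
    A[u, u + 1] = A[u + 1, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

labels = np.array([0, 0, 1, 1, 2, 2])
S = indicator_matrix(labels, 3)

cut_from_trace = 0.5 * np.trace(S.T @ L @ S)                      # right side of Eq. (5)
cut_direct = 0.5 * A[labels[:, None] != labels[None, :]].sum()    # Eq. (1)
```

Both expressions count the two cut edges 1-2 and 3-4, and the indicator matrix satisfies $\mathrm{Tr}(S^T S) = n$.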
So the problem of minimizing $\mathrm{Cut}(A_1, \dots, A_k)$ can be rewritten as

$$\frac{1}{2} \min_{\substack{A_1, \dots, A_k \\ \mathrm{Tr}(S^T S) = n}} \mathrm{Tr}(S^T \mathcal{L} S) \quad \text{for all } S \text{ as in Eq. (3)}. \qquad (6)$$

Note that the optimization problem is posed over matrices $S$ whose entries are only allowed to
take the values 0 or 1, which makes the problem computationally hard to solve. In order to find
a possible solution, we relax the discreteness condition and allow $S_{ij}$ to take arbitrary
values in $\mathbb{R}$. The general form is then

$$\min_{A_1, \dots, A_k} \mathrm{Tr}(S^T \mathcal{L} S). \qquad (7)$$

This is the standard form of a trace minimization problem, and the Rayleigh-Ritz theorem
tells us that the minimum is achieved by an $S$ containing the first $k$ eigenvectors of
$\mathcal{L}$ as its columns. Let us make use of the Laplacian eigenvectors to rewrite Eq. (7).

Since $\mathcal{L}$ is a symmetric positive semi-definite matrix, its eigenvalues are all real
and non-negative. Let $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_k \le \cdots \le \lambda_n$
be the eigenvalues of $\mathcal{L}$, with corresponding normalized eigenvectors
$f_1, \dots, f_n$. For each row $i$ of the Laplacian matrix $\mathcal{L}$, we have

$$\sum_{j=1}^{n} \mathcal{L}_{ij} = D_{ii} - \sum_{j=1}^{n} A_{ij} = 0.$$

This implies that $(1, \dots, 1)^T$ is an eigenvector of $\mathcal{L}$ with eigenvalue 0, so
$0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_k \le \cdots \le \lambda_n$.

Thus $\mathcal{L} = F \mathcal{D} F^T$, where $F = (f_1 | \cdots | f_n)$ is the eigenvector
matrix and $\mathcal{D}$ is the diagonal matrix with $\mathcal{D}_{ii} = \lambda_i$.



$$\begin{aligned}
\mathrm{Tr}(S^T \mathcal{L} S) &= \sum_{l=1}^{k} S_l^T \mathcal{L} S_l \\
&= \sum_{j=2}^{n} \sum_{l=1}^{k} \lambda_j (f_j^T S_l)^2 \qquad (8) \\
&= \sum_{j=2}^{n} \lambda_j W_j, \qquad (9)
\end{aligned}$$

where $W_j = \sum_{l=1}^{k} (f_j^T S_l)^2$. In this way the optimization is split into pieces
according to the eigenvalues $\{\lambda_j\}$, as shown in Eq. (8), with column-based operations
on the eigenvector matrix $F$. Our course is then clear: find a partition that places as much as
possible of the weight $W_j$ on the small eigenvalues and as little as possible on the large
ones. In terms of Eq. (9), the properties of the eigenvectors have great influence on this
process, which motivates a further exploration of the eigenvector space to help approximate the
optimum.
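
As a numerical sanity check of the decomposition in Eq. (9), here is a minimal sketch using `numpy.linalg.eigh` on a hypothetical six-vertex path graph with three groups. The graph and the grouping are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical example: the path 0-1-2-3-4-5 split into three consecutive pairs.
n, k = 6, 3
A = np.zeros((n, n))
for u in range(n - 1):
    A[u, u + 1] = A[u + 1, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

labels = np.array([0, 0, 1, 1, 2, 2])
S = np.zeros((n, k))
S[np.arange(n), labels] = 1.0

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors as columns.
eigvals, F = np.linalg.eigh(L)

# W[j] = sum over groups l of (f_j^T S_l)^2, the weight attached to eigenvalue j.
W = ((F.T @ S) ** 2).sum(axis=1)

lhs = np.trace(S.T @ L @ S)
rhs = float((eigvals * W).sum())
```

Since the eigenvectors form an orthonormal basis, the weights sum to $\mathrm{Tr}(S^T S) = n$, so a partition only redistributes a fixed total weight among the eigenvalues.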

2.2 Nodal Domains by Eigenvectors 

Recall the basic characterization of the eigenvalues and eigenvectors,

$$\mathcal{L} f_j = \lambda_j f_j.$$

Multiplying both sides by the vector $f_j^T$, we have

$$f_j^T \mathcal{L} f_j = \lambda_j,$$

where $f_j$ is normalized, so that $\lambda_j$ is the value of the Rayleigh quotient of
$\mathcal{L}$ at $f_j$. In fact, for $j \in \{2, \dots, n\}$, the eigenvalues can be
characterized as

$$\lambda_j = \inf_{f \perp W_{j-1},\, \|f\| = 1} f^T \mathcal{L} f, \qquad (10)$$

where $W_{j-1}$ is the subspace spanned by the eigenvectors of the smallest $j - 1$ eigenvalues
[S]. A simple calculation shows that, for arbitrary $f \in \mathbb{R}^n$,

$$f^T \mathcal{L} f = \sum_{u \sim v} (f(u) - f(v))^2,$$

which can be substituted into Eq. (10) to give

$$\lambda_j = \inf_{f \perp W_{j-1},\, \|f\| = 1} \sum_{u \sim v} (f(u) - f(v))^2, \qquad (11)$$

where $\lambda_j$ is attained when $f$ is exactly the $j$-th eigenvector.

It is easy to see that, under the condition $f \perp W_{j-1}$, the eigenvector $f_j$ is the
weight function that minimizes the total squared difference between pairs of adjacent nodes over
the whole network. Such a property indicates a structure in which nodes are more likely to
appear in the same group if their corresponding entries in $f_j$ are close [34]. However, these
values are so widely distributed that there is no clear criterion for drawing the boundary
between groups.
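
The quadratic-form identity used in Eq. (11) can be spelled out in a few lines. This is a sketch of the standard argument, writing $d_u = \sum_v A_{uv}$ and counting each edge $u \sim v$ once in the final sum:

```latex
\begin{aligned}
f^T \mathcal{L} f
  &= \sum_{u} d_u f(u)^2 - \sum_{u, v} A_{uv} f(u) f(v) \\
  &= \frac{1}{2} \sum_{u, v} A_{uv} \left( f(u)^2 + f(v)^2 - 2 f(u) f(v) \right) \\
  &= \sum_{u \sim v} \left( f(u) - f(v) \right)^2 .
\end{aligned}
```

The second line uses the symmetry $A_{uv} = A_{vu}$, and the factor $\frac{1}{2}$ disappears because each edge appears twice in the double sum.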

Let us look back at Eq. (9). The matrix $S$, which represents the partition, should be chosen to
make $W_j$ relatively large when $j$ is small. This idea offers a possible way to describe the
criterion. For a single $f_j$, the corresponding $W_j$ is maximized when the nodes are grouped
as follows:

$$\begin{cases} x \in S_1, & \text{if } f_j(x) > 0, \\ x \in S_2, & \text{if } f_j(x) < 0. \end{cases} \qquad (12)$$

The set $\{x \mid f_j(x) = 0\}$ is not mentioned because it makes no contribution to the
maximization of $W_j$. However, the network structure can be so complicated that nodes in the
same group $S_1$ or $S_2$ might not be connected at all. See Fig. 1 for instance.

Hence, instead of clustering nodes as in Eq. (12), we add the natural constraint that each group
should be a connected subgraph, in accordance with the geometric structure of the graph. The
square in Eq. (8) suggests that two properties of the subgroups are preferred: (a) nodes within
the same subgroup share the same sign in $f_j$; (b) each subgroup contains as many nodes as
possible. This consideration leads us to focus on partitions derived from the signs of the
entries of each eigenvector.

An interesting result, the Courant-Hilbert nodal theorem [7], gives a detailed description
of the domains cut out by the zeros of each eigenfunction.

Given the self-adjoint second order differential equation $L[u] + \lambda \rho u = 0$ ($\rho > 0$) for a domain $G$
with arbitrary homogeneous boundary conditions, if its eigenfunctions are ordered according to increasing
eigenvalues, then the zeros of the $n$-th eigenfunction $u_n$ divide the domain into no more than $n$
subdomains. No assumptions are made about the number of independent variables. [From Courant-Hilbert:
Methods in Mathematical Physics.]

Figure 1: The cockroach graph from Guattery and Miller [26] has 80 nodes, with each ellipsis
representing a line of 16 nodes. Here we apply the fourth eigenvector to obtain this special
partition, where black marks nodes with negative eigenvector values and white marks positive
ones. It is easy to see that $W_j$ (here $j = 4$) reaches its maximum when nodes with the same
sign are in the same group, while the graph structure tells us that the negative nodes are
divided into three unconnected parts.

This inspires us to consider the corresponding discrete situation, namely the discrete
nodal domain theorem on a graph $G = (V, E)$. We define a positive (negative) strong nodal
domain of a function $f$ on $V(G)$ to be a maximal connected induced subgraph of $G$ on vertices
$v \in V$ with $f(v) > 0$ ($f(v) < 0$). Meanwhile, a positive (negative) weak nodal domain of a
function $f$ on $V(G)$ is a maximal connected induced subgraph with $f(v) \ge 0$ ($f(v) \le 0$)
that contains at least one nonzero-valued node. The relation between eigenvectors and nodal
domains was introduced and studied by Davies, Gladwell, Leydold, Stadler, et al. [6] [12]. One of
the important results is

Theorem 2.1. [6] Let $M$ be a symmetric matrix with nonnegative diagonal entries and $M_{uv} < 0$
whenever $u \sim v$. Let $\lambda_1 \le \cdots \le \lambda_k = \lambda_{k+1} = \cdots = \lambda_{k+r-1} < \lambda_{k+r} \le \cdots \le \lambda_n$
be the eigenvalues of $M$ and let $f_k$ be an eigenvector corresponding to $\lambda_k$. Then the
number of strong nodal domains of $f_k$ is no more than $k + r - 1$, and the number of weak
nodal domains of $f_k$ is no more than $k$.

This theorem states the matrix $M$ in a general form that includes the graph Laplacian. It
provides a natural structural framework in which we can maximize $W_j$ while respecting the
connectivity information. An interesting situation occurs when there are zero entries in the
eigenvectors. Weak nodal domains share these nodes, which means that an overlapping structure
may be naturally defined, or that certain basic functional parts in the dynamic system of the
network are revealed.
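
To make the definition of weak nodal domains concrete, the following is a minimal sketch. The function names, the tolerance `tol` used to decide which entries count as zero, and the two path-graph examples are assumptions for illustration; the first example uses a hand-picked vector with a zero entry, the second uses Laplacian eigenvectors of a path on 8 vertices.

```python
import numpy as np

def components(vertices, adj):
    """Connected components of the subgraph induced on `vertices` (depth-first search)."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w in vertices and w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def weak_nodal_domains(f, adj, tol=1e-12):
    """Maximal connected sets with f >= 0 (or f <= 0) containing a nonzero vertex."""
    sign = [0 if abs(x) <= tol else (1 if x > 0 else -1) for x in f]
    domains = []
    for s in (1, -1):
        allowed = [v for v in range(len(f)) if sign[v] in (s, 0)]
        for comp in components(allowed, adj):
            if any(sign[v] == s for v in comp):
                domains.append((s, comp))
    return domains

# A zero vertex is shared by the positive and the negative weak domain.
adj3 = {0: [1], 1: [0, 2], 2: [1]}
small = weak_nodal_domains([1.0, 0.0, -1.0], adj3)

# Laplacian eigenvectors of the path on 8 vertices: count the weak nodal domains of each.
n = 8
adj = {v: [w for w in (v - 1, v + 1) if 0 <= w < n] for v in range(n)}
A = np.zeros((n, n))
for v, nbrs in adj.items():
    A[v, nbrs] = 1.0
L = np.diag(A.sum(axis=1)) - A
eigvals, F = np.linalg.eigh(L)
counts = [len(weak_nodal_domains(F[:, k], adj)) for k in range(n)]
```

In the first example the zero vertex 1 belongs to both domains, which is the overlapping behavior described above. In the second, the $k$-th eigenvector never has more than $k$ weak nodal domains, in line with Theorem 2.1.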

3 Partition by weak nodal domain 

In particular, the eigenvector belonging to the second smallest eigenvalue of the Laplacian of
the network is called the Fiedler vector [18]. It has been applied in graph bi-partitioning [H]
and spectral clustering [5]. Generally speaking, these methods provide partitions that place
rather large weight on the smallest nonzero eigenvalue in Eq. (9), so as to make
$\mathrm{Cut}(A_1, \dots, A_k)$ small. However, bad results are found for many of the graphs in
[26]. Take the "cockroach graph" [26] as an example: the second eigenvector yields a division
that cuts horizontally through the ladder, while the ideal cut is obviously the dashed line (see
Fig. 2(a)).

The discrete nodal domain theorem indicates that the Fiedler vector determines no more than two
weak nodal domains, but for the cockroach graph the ideal cut separates the graph into three
parts. Naturally, we are interested in the behavior of the third eigenvector. Calculation shows
that it provides exactly the same separation as the ideal cut (see Fig. 2(b)).

Figure 2: The cockroach graph from Guattery and Miller [26], where black marks nodes with
negative eigenvector values and white marks nodes with positive values. (a) The horizontal cut
is deduced from the second eigenvector; the dashed line indicates the ideal separation. (b) The
partition deduced from the third eigenvector.

This result leads us back to Eq. (9). Besides the weights $W_j = \sum_{l=1}^{k} (f_j^T S_l)^2$,
the eigenvalues also play an important role in the minimization. In fact, the cockroach graph
possesses two eigenvalues, $\lambda_2 = 0.0057$ and $\lambda_3 = 0.0062$, that are relatively
close compared with $\lambda_4 = 0.0246$ and the other eigenvalues, which implies that placing
heavy weight on $\lambda_3$ is also a possible optimal choice. Hence we can use eigenvectors
other than the Fiedler vector to partition the graph; this is the so-called weak-nodal-domain
partition (WNDP).

In most situations, eigenvectors corresponding to smaller eigenvalues provide WNDPs with
smaller cut size, which makes a multi-community structure hard to determine. To overcome this
shortcoming, we take the well-known quality function modularity $Q$ [41] as a criterion:

$$Q = \mathrm{Tr}(S^T (A - P) S),$$

where $P = (P_{ij})$ is the random configuration counterpart of $A$, satisfying
$P_{ij} = \frac{d_i d_j}{2|E|}$ [41].

This quality function measures the difference between the number of edges within communities
and the number of edges expected within communities. Interestingly, we have

$$Q = \mathrm{Tr}(S^T ((D - P) - (D - A)) S) = \mathrm{Tr}(S^T (D - P) S) - \mathrm{Tr}(S^T (D - A) S),$$

where the first term represents the expected number of edges between communities and the second
term the actual number of such edges, each counted twice. Note that, by Eq. (5), the second term
is exactly twice the cut size of the network mentioned before. We are interested in the
difference between the WNDP and the exact optimum of the modularity $Q$.
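
The decomposition of $Q$ can be verified on a small example. The sketch below assumes the configuration null model $P_{ij} = d_i d_j / (2|E|)$, which is the usual choice for modularity and is an assumption here, and reuses a hypothetical toy graph of two triangles joined by one edge.

```python
import numpy as np

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

d = A.sum(axis=1)
D = np.diag(d)
P = np.outer(d, d) / d.sum()          # assumed null model; d.sum() equals 2|E|

labels = np.array([0, 0, 0, 1, 1, 1])
S = np.zeros((6, 2))
S[np.arange(6), labels] = 1.0

Q = np.trace(S.T @ (A - P) @ S)
expected_term = np.trace(S.T @ (D - P) @ S)   # expected inter-community edges, counted twice
actual_term = np.trace(S.T @ (D - A) @ S)     # actual inter-community edges, counted twice
```

For this partition the second term equals 2, twice the single cut edge, and the difference of the two terms reproduces $Q$.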

Given a WNDP, $Q$ can be improved in three ways:

(i) Two communities merge: let the volumes be $V_i = \sum_{h \in A_i} d_h$ and
$V_j = \sum_{h \in A_j} d_h$, and let the number of connecting edges be $C_{ij}$. This
alteration only happens when

$$\frac{V_i V_j}{2|E|} < C_{ij}.$$

(ii) One community splits into two: with the same definitions as above, this alteration only
happens when

$$\frac{V_i V_j}{2|E|} > C_{ij}.$$

(iii) A single vertex moves from one community to another: for a vertex $v$ with $n_i$
connections to $A_i$ and $n_j$ connections to $A_j$, assigning it to $A_i$ benefits $Q$ when

$$\frac{V_i n_i}{2|E|} - \frac{V_j n_j}{2|E|} < n_i - n_j.$$

The first two operations are usually not needed if the community structure is relatively
clear (see the examples below), so slight alterations of individual vertices may lead to a good
approximation.
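
As a brief check of condition (i), suppose the null model is $P_{uv} = d_u d_v / (2|E|)$ (an assumption here, chosen because it is the standard configuration model). Merging $A_i$ and $A_j$ adds exactly the cross terms between the two sets to $Q = \mathrm{Tr}(S^T (A - P) S)$:

```latex
\begin{aligned}
\Delta Q &= 2 \sum_{u \in A_i} \sum_{v \in A_j} \left( A_{uv} - \frac{d_u d_v}{2|E|} \right) \\
         &= 2 \left( C_{ij} - \frac{V_i V_j}{2|E|} \right).
\end{aligned}
```

Under this assumption the merge increases $Q$ exactly when $V_i V_j / (2|E|) < C_{ij}$, and the reverse inequality favors keeping (or making) the two parts separate.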

In practice, this algorithm is carried out by matrix calculation. In particular, given the
function $f$, the complexity of finding the weak nodal domains is $O(|V|)$.

Recall that there may be zero entries in the eigenvectors. Define a vertex $v$ to be a
zero vertex if $f(v) = 0$. Similarly, a zero component is a maximal connected subgraph of zero
vertices. We preprocess the graph in the following steps:

(i) Contract every zero component into a single vertex that inherits the connections of the
component. Here, multiple edges are allowed. This leads to a graph $G_1$.

(ii) Split each zero vertex $v$ in $G_1$ into two connected individual vertices $v^+$ and $v^-$,
where $v^+$ inherits all neighbors of $v$ with positive values in $f$ and $v^-$ inherits those
with negative values (note that either all neighbors of $v$ are zero vertices or $v$ has
neighbors of both signs). This leads to a graph $G_2$.

We call the new graph $G_2$ obtained after these two steps a weak domain graph. It is the graph
to which our algorithm is applied. Note that the most expensive part of the method is
calculating the eigenvectors of the sparse Laplacian matrix, which is known to be polynomial of
order $O(n^2)$, and one can apply the shifted power method to get the $k$-th eigenvector which is
of complexity



The situation with zero components can be complicated, because modularity is defined on
structures without overlap. In this paper, restricting the definition of a zero value to
magnitudes of order $O(10^{-17})$ avoids overlapping structure. Ambiguous boundaries will be
discussed in our future work.

4 Experiments with WNDP 

The whole comparison framework defined above allows us to test for the best partition as well
as a reasonable number of clusters. More precisely, we try to find out what information the
eigenvectors suggest about a possible partition. Here, three different networks are processed
with our algorithm to test the WNDP method.
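
The selection procedure used in the experiments can be sketched as follows. This is a simplified illustration, not the exact implementation: it treats zero entries as nonnegative to avoid overlaps, omits the refinement steps (merge, split, vertex moves), assumes the null model $P_{ij} = d_i d_j / (2|E|)$, and uses a dense eigensolver. The function names and the toy graph are hypothetical.

```python
import numpy as np

def sign_partition(f, A):
    """Label connected components of the same-sign subgraphs of f (zeros count as >= 0)."""
    n = len(f)
    positive = f >= 0
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for w in np.flatnonzero(A[u]):
                if labels[w] < 0 and positive[w] == positive[u]:
                    labels[w] = current
                    stack.append(w)
        current += 1
    return labels

def modularity(A, labels):
    """Q = Tr(S^T (A - P) S) with the assumed null model P = d d^T / (2|E|)."""
    d = A.sum(axis=1)
    P = np.outer(d, d) / d.sum()
    same = labels[:, None] == labels[None, :]
    return (A - P)[same].sum()

def best_wndp(A):
    """Return (Q, j, labels) for the eigenvector f_j, j >= 2, whose sign partition maximizes Q."""
    L = np.diag(A.sum(axis=1)) - A
    _, F = np.linalg.eigh(L)
    best = None
    for j in range(1, A.shape[0]):
        labels = sign_partition(F[:, j], A)
        q = modularity(A, labels)
        if best is None or q > best[0] + 1e-9:
            best = (q, j + 1, labels)   # report j with 1-based indexing, as in the text
    return best

# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

q_best, j_best, labels_best = best_wndp(A)
```

On this toy graph the second eigenvector is selected and its sign partition recovers the two triangles.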

4.1 Dolphin Graph 

The Dolphin graph [33] is based on a group of bottlenose dolphins living in Doubtful Sound, New
Zealand. The pattern of which dolphins get along together suggests the relations among members
of this representative animal social network. See Fig. 3(b) for the detailed connections.
Scientists found an interesting phenomenon after years of observation: the whole group of
dolphins split into two smaller subgroups following the departure of one key member named
'SN89'. This observed division is represented with colors in the figure.

We process the graph with our algorithm, and the outcome indicates that $f_2$ yields a
reasonable WNDP (see Fig. 3(a)). The solid line in Fig. 3 depicts the division obtained by the
calculation. Only 'SN89' does not match the observed division. The background of this division
tells us that 'SN89' is an ambiguous node because of its role as a link between the two
subgroups. Thus, the result of our method is quite close to reality.

4.2 Political Books Graph

The Political Books graph was assembled and studied by Krebs [32]. The data were taken from the
online bookseller www.amazon.com, and 105 books were considered. Edges between books represent
frequent co-purchasing by the same buyers. Krebs collected this information in 2004 around the
US election. He wanted to study the relation between books describing different political
views. Naturally, this graph inherits a bilateral structure arising from the two political
tendencies in the United States, Democratic and Republican.

It is not surprising that the WNDP by $f_2$ of the Political Books graph attains the largest
modularity. Fig. 4(b) shows the resulting partition, in which each side has its particular
groups of authors and readers. An inspection of the original graph shows that our prediction
still places two liberal books in the conservative group.

4.3 Capocci Graph 

The Capocci graph is a simple graph (shown in Fig. 6, upper left) generated by Capocci to
illustrate the use of eigenvector components for identifying communities [8]. That experiment
shows that the second eigenvector of the right stochastic matrix $D^{-1} A$ exhibits three
plateaus corresponding to the three evident components of the graph.

Fig. 5 depicts the numerical result of our algorithm: the largest modularity is given by
$f_3$. The detailed partition is shown in Fig. 6(a), where we also show the WNDPs of the
eigenvectors corresponding to the first three nonzero eigenvalues. Clearly, the WNDP at the
bottom left, given by the chosen $f_3$, yields a partition matching our visual observation.

In contrast to the Dolphin graph and the Political Books graph, the best WNDP of the Capocci
graph is given by $f_3$ rather than $f_2$. This small difference reminds us of the important
role that the eigenvalues $\{\lambda_j\}$ play in Eq. (9). To illustrate this idea, we separate
the small eigenvalues from the relatively large ones:

$$\sum_{j=2}^{n} \lambda_j W_j = \sum_{j=2}^{s} \lambda_j W_j + \sum_{j=s+1}^{n} \lambda_j W_j,$$

where $\{\lambda_j\}_{j=1}^{s}$ are the relatively small eigenvalues and
$\{\lambda_j\}_{j=s+1}^{n}$ the large ones. Our experiments suggest that the eigenvectors
corresponding to $\{\lambda_j\}_{j=1}^{s}$ may hold the information about the community
structure of the graph.

4.4 Computer-Generated Graphs 

To take a rough look at the community structure, we contract each cluster of the WNDP into a
single node that inherits the connections of the cluster. Apart from multiple edges, no cycles
exist in this simple structure, which is, in brief, a tree. This interesting phenomenon comes
from the fact that only two signs, '+' and '-', are used to distinguish nodes. We generate a
graph artificially following the method used by Girvan and Newman [27], also called the ad hoc
network, in which a graph of 128 vertices is divided into four communities of 32 vertices each.
More precisely, in our case, edges between pairs of nodes in the same community are placed with
probability 0.4, while those between different communities are placed with probability 0.1. The
randomness of the edges allows many realizations; however, we are only interested in the known
community structure of these graphs. The graph in Fig. 7 is one such sample, generated by Girvan
and Newman's method, to which our algorithm is applied. It is easy to see that at least four
signs would be required to separate these four communities, which suggests that our method is
incomplete.
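
A graph of this kind can be generated with a few lines of NumPy. The sketch below follows the parameters stated above (four communities of 32 vertices, probabilities 0.4 and 0.1); the function name, the seed and the use of `numpy.random.default_rng` are assumptions for illustration.

```python
import numpy as np

def planted_partition(n_groups=4, group_size=32, p_in=0.4, p_out=0.1, seed=0):
    """Random graph with planted communities; returns (adjacency matrix, community labels)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_groups), group_size)
    n = labels.size
    same = labels[:, None] == labels[None, :]
    prob = np.where(same, p_in, p_out)
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    A = (upper | upper.T).astype(float)
    return A, labels

A, labels = planted_partition()
```

The resulting matrix is symmetric with zero diagonal, and edges inside a community are markedly denser than edges between communities.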
Calculation shows that the WNDPs by the eigenvectors belonging to the first three nonzero
eigenvalues each divide the graph into two parts (see Fig. 7). In other words, each eigenvector
reveals part of the community structure. By combining these pieces of partial information, we
may recover knowledge of the whole structure. This leads us to a generalized definition of
nodal domains: a strong nodal domain of the functions $\{f_1, \dots, f_i\}$ on $V(G)$ is a
maximal connected induced subgraph of $G$ on vertices $v \in V$ whose corresponding vectors
$(f_1(v), \dots, f_i(v))$ belong to the same orthant of the $i$-dimensional Euclidean space.
Note that $i$ is the number of eigenvectors $\{f_1, \dots, f_i\}$ that we combine. The
definition of weak nodal domains is altered in the same way.

Convincing results are obtained by applying our algorithm to these generalized WNDPs (see
Fig. 7). A single eigenvector does separate the whole network into reasonable parts, but not
finely enough. Structure at a relatively small scale is revealed by an appropriate blend of
certain eigenvectors.
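
The generalized strong nodal domains can be computed by grouping vertices according to their joint sign pattern and then taking connected components. The following is a minimal sketch; the function name is hypothetical, the input vectors are hand-picked rather than eigenvectors, and zero entries are assumed absent.

```python
def generalized_nodal_domains(vectors, adj):
    """Connected components of vertices whose sign patterns across all vectors agree.

    Assumes no entry is exactly zero, so every vertex lies in the interior of an orthant.
    """
    n = len(adj)
    pattern = [tuple(f[v] > 0 for f in vectors) for v in range(n)]
    label = [None] * n
    domains = []
    for start in range(n):
        if label[start] is not None:
            continue
        label[start] = len(domains)
        comp, stack = [], [start]
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if label[w] is None and pattern[w] == pattern[u]:
                    label[w] = label[start]
                    stack.append(w)
        domains.append(sorted(comp))
    return domains

# Hypothetical example on the path 0-1-...-7 with two hand-picked sign vectors.
n = 8
adj = {v: [w for w in (v - 1, v + 1) if 0 <= w < n] for v in range(n)}
f1 = [1, 1, 1, 1, -1, -1, -1, -1]
f2 = [1, 1, -1, -1, -1, -1, 1, 1]

only_f1 = generalized_nodal_domains([f1], adj)
only_f2 = generalized_nodal_domains([f2], adj)
combined = generalized_nodal_domains([f1, f2], adj)
```

Each vector alone yields a coarse division, while the combination of the two sign patterns separates four groups, which mirrors the behavior described for the four-community sample.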



5 Conclusion 

The experiments above confirm that the nodal domains of the Laplacian eigenvectors do hold
certain information about the community structure of the graph. Still, exceptions arise when
there is a relatively large difference between the volumes of the communities. To illustrate,
we set up a graph in the following three steps (see Fig. 8):

• Use the ER model [15] to build a graph, named $G_1$, of 200 nodes whose average degree is 40
(nodes drawn as solid squares).

• Use the ER model to build another graph, named $G_2$, of 20 nodes whose average degree is 4
(nodes drawn as solid circles).

• Add an extra single node; first connect it to a random node in $G_2$, then connect it to
five random nodes in $G_1$ (node drawn as a triangle).
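
The three construction steps can be sketched as follows. The edge probability $p = \text{average degree} / (n - 1)$ is an assumption about how the ER model was parameterized, and the function names and seed are hypothetical.

```python
import numpy as np

def er_adjacency(n, avg_degree, rng):
    """Erdos-Renyi graph G(n, p) with p chosen so that the expected degree is avg_degree."""
    p = avg_degree / (n - 1)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def build_counterexample(seed=0):
    """G1 (200 nodes), G2 (20 nodes) and one extra node linked to both, as in the three steps."""
    rng = np.random.default_rng(seed)
    n1, n2 = 200, 20
    n = n1 + n2 + 1
    A = np.zeros((n, n))
    A[:n1, :n1] = er_adjacency(n1, 40, rng)
    A[n1:n1 + n2, n1:n1 + n2] = er_adjacency(n2, 4, rng)
    extra = n - 1                                           # the "triangle" node
    to_g2 = n1 + rng.choice(n2, size=1, replace=False)      # one random node in G2
    to_g1 = rng.choice(n1, size=5, replace=False)           # five random nodes in G1
    for v in np.concatenate([to_g2, to_g1]):
        A[extra, v] = A[v, extra] = 1.0
    return A, extra

A, extra = build_counterexample()
```

The extra node ends up with six neighbors, five of them in $G_1$, which is the imbalance discussed next.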

Calculation suggests that the WNDP by the second eigenvector possesses the largest modularity.
The solid line in Fig. 8 shows the resulting division. But direct observation tells us that the
triangle has more connections with $G_1$ than with $G_2$.

A closer look at the algorithm shows that computing an eigenvector amounts to assigning each
node an average of its neighbors' eigenvector values, scaled by a rate determined by the
eigenvalue. Thus the eigenvector $f_2$ of the Laplacian of this graph assigns positive values
to nodes in $G_1$ and negative values to nodes in $G_2$. Because $G_2$ has far fewer nodes than
$G_1$, $f_2$ takes relatively small negative values on $G_2$, which indicates that, under
averaging, the triangle can receive a negative value just as the nodes in $G_2$ do. Further
work should focus on correcting the position of boundary nodes like the triangle.

In Section 3 we already provided a remedy for this situation, namely checking the boundary
vertices for improvements of the modularity $Q$. But recalling the first two operations for
modifying $Q$, we note that if the volumes of the two communities are small or unbalanced (with
a wide gap), the two will not be separated because the inequality is not satisfied. This is why
modularity has a well-known resolution limit, which makes clusters smaller than a given size
undetectable. A resolution coefficient $\lambda$ can be introduced to compensate for this
limitation, so that the inequality becomes

|pf<^A e( 0,l] 



where the vertices are localized in the sense that we only consider the influence from their
neighbours within range.

To conclude, in this paper we introduce a method that uses the weak nodal domains of the
eigenvectors of the graph Laplacian. We calculate the modularity [40] [41] to decide which WNDP
behaves best, so that it may suggest the optimal number of communities in a network. We also
test our algorithm on real-world networks. As the examples show, WNDP works quite well in
deciding the number of clusters in these graphs. For unweighted graphs, the mechanism by which
WNDP works so well is still unknown; we will work on this in our future study, as well as on
finding other good parameters for choosing the best eigenvectors.

Figure 3: The Dolphin Graph [33]. (a) Modularities of WNDPs by different eigenvectors of the
Laplacian on the Dolphin graph. The WNDP by $f_2$ is the best choice. (b) The colors represent
the real observed division of the 62 bottlenose dolphins. The solid line represents the result
of WNDP. Following our algorithm yields a partition with only one controversial node.

Figure 4: The Political Books graph [32]. (a) Modularities of WNDPs by different eigenvectors
of the Laplacian on the Political Books graph. The WNDP by $f_2$ is selected by the algorithm.
(b) Books in dark color are liberal, the grey ones are centrist or unaligned, and the remaining
uncolored ones are conservative. The grey books act as buffers between those with left-wing and
right-wing points of view. The solid line divides the graph following the WNDP of $f_2$. Only
two ambiguous nodes, representing books purchased frequently by both sides, are positioned
incorrectly.

Figure 5: Modularities of WNDPs by different eigenvectors of the Laplacian on the Capocci
graph, among which $f_3$ is the best choice.

Figure 6: The Capocci Graph [8]. (a) Upper left: the original graph; upper right: partition by
$f_2$; bottom left: partition by $f_3$; bottom right: partition by $f_4$. Generally speaking,
eigenvectors other than those above would possess more complicated nodal domain structures.
(b) The eigenvalues of the Laplacian on the Political Books graph. Note that the first three
eigenvalues are considerably smaller than the rest.

Figure 7: The artificial sample generated by computer. The four communities are easily
recognized by direct observation. The color shades represent the different clusters suggested
by the eigenvectors. The small graphs on top are separated by the WNDPs corresponding to the
second and fourth eigenvectors respectively, each of which detects only two clusters. The
combination of these two results reveals the exact community structure, and it is exactly the
mathematical outcome of the generalized WNDP.

Figure 8: This special graph is constructed by following the three steps above. The triangle in
the figure is the key contradiction. The solid line divides the graph by WNDP, while direct
observation shows that the triangle is placed incorrectly.